WikiSent: Weakly Supervised Sentiment Analysis through Extractive Summarization with Wikipedia
نویسندگان
چکیده
This paper describes a weakly supervised system for sentiment analysis in the movie review domain. The objective is to classify a movie review into a polarity class, positive or negative, based on those sentences bearing opinion on the movie alone. The irrelevant text, not directly related to the reviewer opinion on the movie, is left out of analysis. Wikipedia incorporates the world knowledge of movie-specific features in the system which is used to obtain an extractive summary of the review, consisting of the reviewer’s opinions about the specific aspects of the movie. This filters out the concepts which are irrelevant or objective with respect to the given movie. The proposed system, WikiSent, does not require any labeled data for training. The only weak supervision arises out of the usage of resources like WordNet, Part-of-Speech Tagger and Sentiment Lexicons by virtue of their construction. WikiSent achieves a considerable accuracy improvement over the baseline and has a better or comparable accuracy to the existing semisupervised and unsupervised systems in the domain, on the same dataset. We also perform a general movie review trend analysis using WikiSent to find the trend in movie-making and the public acceptance in terms of movie genre, year of release and polarity.
منابع مشابه
Microsummarization of Online Reviews: An Experimental Study
Mobile and location-based social media applications provide platforms for users to share brief opinions about products, venues, and services. These quickly typed opinions, or microreviews, are a valuable source of current sentiment on a wide variety of subjects. However, there is currently little research on how to mine this information to present it back to users in easily consumable way. In t...
متن کاملExtractive Based Automatic Text Summarization
Automatic text summarization is the process of reducing the text content and retaining the important points of the document. Generally, there are two approaches for automatic text summarization: Extractive and Abstractive. The process of extractive based text summarization can be divided into two phases: pre-processing and processing. In this paper, we discuss some of the extractive based text ...
متن کاملHybrid Approach for Single Text Document Summarization Using Statistical and Sentiment Features
Summarization is a way to represent same information in concise way with equal sense. This can be categorized in two type Abstractive and Extractive type. Our work is focused around Extractive summarization. A generic approach to extractive summarization is to consider sentence as an entity, score each sentence based on some indicative features to ascertain the quality of sentence for inclusion...
متن کاملSemi-supervised extractive speech summarization via co-training algorithm
Supervised methods for extractive speech summarization require a large training set. Summary annotation is often expensive and time consuming. In this paper, we exploit semi-supervised approaches to leverage unlabeled data. In particular, we investigate co-training for the task of extractive meeting summarization. Compared with text summarization, speech summarization task has its unique charac...
متن کاملIncorporating Content Structure into Text Analysis Applications
In this paper, we investigate how modeling content structure can benefit text analysis applications such as extractive summarization and sentiment analysis. This follows the linguistic intuition that rich contextual information should be useful in these tasks. We present a framework which combines a supervised text analysis application with the induction of latent content structure. Both of the...
متن کامل